Main Findings

Column

Scatterplot: Musical Sophistication and Valence Estimation Error

Column

Interactive Experiment: Intro

We challenge you to beat our participants! This demo lets you experience our research interactively. The following tabs contain short versions of the different parts of our survey. First, you take a mini Gold-MSI test, and the result places you on the musical sophistication scale: the higher your score, the more knowledge of and experience with music you have. The second part consists of a few songs that we used in the survey. It is up to you to estimate the valence of each song as accurately as you can. For both the Gold-MSI test and the song estimation test, the results are on the subsequent tab. Once you have determined your scores, the graph shows approximately where you would have fallen in our sample and whether you have surpassed our participants. The answer tabs explain how to grade yourself. Good luck!

Mini-MSI

Below are a few statements to approximate your musical sophistication. Answer the first four with a value from 1 to 7, where 1 means you completely disagree, 4 that you neither agree nor disagree, and 7 that you completely agree. For the last question, choose one of the options listed below it.

Musical Fragments

Listen to the following four music fragments and estimate the valence of each fragment. These values also range from 1 to 7, where 1 means extremely low valence, 4 neither high nor low, and 7 extremely high.

Answers and Grading

First, the Mini-MSI results. Each answer is worth a maximum of 7 points. If points are allocated in ascending order, the answer 7 is worth 7 points and the answer 1 is worth 1 point. If points are allocated in descending order, it is exactly the other way around. For the last question, the points depend on which option you chose: the first option is worth 1 point, the second 2 points and the last 7 points. Add up your points, divide the total by 5 and multiply it by 18 to see approximately what your score would have been on the full test. For the music, we give the correct valence values. It is up to you to determine how far off you were or whether you were exactly correct. On the following tab we give the results of our participants for these specific songs.
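The grading rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the example answers and the choice of which statement is scored in descending order are all assumptions, and the points for the multiple-choice question are entered directly rather than derived.

```python
def score_likert(answer, ascending=True):
    """Points for one 1-7 answer; descending items are scored the other way around."""
    if not 1 <= answer <= 7:
        raise ValueError("answer must be between 1 and 7")
    return answer if ascending else 8 - answer


def estimate_full_score(item_points):
    """Apply the rule from the text: divide the mini-test total by 5, multiply by 18."""
    if len(item_points) != 5:
        raise ValueError("the mini test has exactly five items")
    return sum(item_points) / 5 * 18


# Hypothetical answers: four statements (the third scored in descending order)
# plus the points looked up for the chosen multiple-choice option.
likert_points = [
    score_likert(6),
    score_likert(5),
    score_likert(2, ascending=False),
    score_likert(7),
]
multiple_choice_points = 2
full_estimate = estimate_full_score(likert_points + [multiple_choice_points])
print(round(full_estimate, 1))
```

With the highest possible answer on every item the estimate reaches 126, the top of the scale implied by the rule.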

Background

Column

Introduction

Nowadays, music is more present in our lives than ever before. According to the International Federation of the Phonographic Industry (2018), the average person spends around 18 hours a week listening to music. The majority of this time is spent on streaming platforms like Spotify, which have shown consistent revenue growth over the past decade. The fact that music has become omnipresent in our lives raises the question of whether this makes people better at identifying and assessing certain musical features. In other words, is there a difference between how musically sophisticated and less sophisticated individuals perceive features of music? If so, this would indicate that the way we think about music does in fact depend on our exposure to it.

Although this question has not been addressed directly in the academic literature, researchers have investigated how differences in musical engagement between people lead to differences in the perception of musical valence and arousal, with valence defined as the positiveness of a track (Frijda, 1986) and arousal as the state of being alert or awake (Warriner et al., 2013). In their research, Olsen and colleagues (2014) conclude that such a difference exists: musical engagement is a significant predictor of perceived musical arousal and valence. However, the relationship between musical sophistication and perceived musical features had not been investigated at the time of writing.

To fill this gap in the literature, we investigate the following research question: are musically sophisticated people better at estimating a musical piece’s valence? We specifically chose to examine the relationship between musical sophistication and valence because valence is a clearly defined, well-studied emotional dimension in psychology. To answer our question, we use surveys in which we ask participants to estimate the valence of 25 randomly selected songs. We then compare their estimates not only with each other, but also with Spotify’s valence rating of each song, our point of reference, which is the rating used by the world’s biggest streaming platform (Spotify, 2018).

Raw data

Column

Materials

Song pool: the song pool contains 240 songs, with 20 songs per genre. The genres include: pop, rock, metal, electronic, dance, house, hip-hop, singer/songwriter, soundtrack, R&B, soul/blues and classical music.

Randomized playlist: from the song pool, 25 songs are randomly selected for each participant to listen to and rate.

Gold-MSI: a questionnaire used to measure musical sophistication, calculated as the total number of points on the measure. Only the short version of the questionnaire is used, i.e., only the items that map onto the general musical sophistication factor.

Trial round: prior to the main valence estimation task, participants are presented with two songs which they rate the valence of, and are then provided with the true valence of said songs. This is used to help the participants better conceptualize what valence is, and what the main task will look like.

Valence rating: a 7-point Likert scale, ranging from “Extremely low” (1) to “Extremely high” (7), used to estimate a song’s valence.

Familiarity measure: a measure used to indicate whether the participant is familiar with the song that they are rating. This is used in later analysis to see if familiarity affects participants’ judgements of a song’s valence. Measured as 1 (familiar) or 0 (unfamiliar).

R Shiny web app: the web app is used to construct a randomized playlist from the song pool, administer the Gold-MSI, and collect valence and familiarity data.
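As a rough illustration of the randomized playlist described above, the sketch below draws 25 songs from a pool of 12 genres with 20 songs each. It is not the code of the web app itself: the placeholder song identifiers, the function names and the assumption that songs are drawn uniformly from the whole pool without replacement (rather than per genre) are ours.

```python
import random

GENRES = [
    "pop", "rock", "metal", "electronic", "dance", "house", "hip-hop",
    "singer/songwriter", "soundtrack", "R&B", "soul/blues", "classical",
]
SONGS_PER_GENRE = 20
PLAYLIST_LENGTH = 25


def build_song_pool():
    """Return (genre, index) placeholders standing in for the 240 pool songs."""
    return [(genre, i) for genre in GENRES for i in range(1, SONGS_PER_GENRE + 1)]


def draw_playlist(pool, rng):
    """Draw 25 distinct songs uniformly at random, ignoring genre (assumption)."""
    return rng.sample(pool, PLAYLIST_LENGTH)


pool = build_song_pool()
# The fixed seed only makes this example repeatable; each participant
# would get a fresh draw.
playlist = draw_playlist(pool, random.Random(42))
print(len(pool), len(playlist))
```

Under this assumption a participant's playlist need not contain every genre, which is why the genre-level plots can rest on different numbers of ratings per participant.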

Song selection

In total, 242 songs were used in this experiment.

Two songs were the same for each participant, namely the trial round songs. These were selected to help participants better conceptualize musical valence. The first song had an extremely high valence value (7), whereas the second song was closer to the middle of the scale and had a slightly low value (3).

The selection process for the other 240 songs began with a breakdown of musical genres. We chose 12 genres based on a survey of 19,000 people aged 16 to 64 that measured the 12 most consumed genres of music in the U.S. (REFERENCE). For each genre, 20 songs were selected using the website ‘rateyourmusic.com’, where albums can be sorted by genre and by their average user-submitted rating. We sampled songs from these highly rated albums, working under the assumption that well-regarded albums in a given genre are the most representative of that genre. Within each genre, the sample contained only one song per artist.

During the selection process, we also took note of the valence of the sampled songs. We made sure that, per genre, no valence value was overrepresented or underrepresented relative to what is usual for that genre. For example, if a genre is generally characterized by songs with higher valence, the song pool for that genre is also skewed towards higher values on average, compared to the song pool of a genre characterized by lower-valence songs.

Once the songs were selected, we trimmed each musical item to a 15-second fragment. As a rule of thumb, the fragment starts one minute into the song. Where a song started with an intro, the intro did not count towards that minute.
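The trimming rule can be written down as a small calculation. The sketch below is only an illustration; the function name and the idea of passing the intro length in seconds are assumptions, since the text does not say how intros were measured.

```python
CLIP_OFFSET_S = 60  # playback time that passes before the fragment starts
CLIP_LENGTH_S = 15  # duration of the fragment


def clip_window(intro_length_s=0.0):
    """Return (start, end) of the fragment in seconds from the start of the file.

    The intro does not count towards the one-minute offset, so its length
    is added on top of it.
    """
    start = intro_length_s + CLIP_OFFSET_S
    return start, start + CLIP_LENGTH_S


print(clip_window())      # song without an intro
print(clip_window(12.5))  # hypothetical song with a 12.5-second intro
```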

Procedure

Participants begin the experiment by completing the Gold-MSI to collect measures for musical sophistication. Participants are then directed to the main experiment, where they are first presented with a practice round. In the practice round, participants listen to two songs, and are then asked to rate each song’s valence. After each rating in the practice round, participants are presented with the song’s true valence as determined by Spotify. After this, participants proceed to the main task, which contains the randomized playlist. After listening to a song from the playlist, participants rate the valence using a 7-point Likert scale, and indicate whether they are familiar with the song.

Exploration

Row

Violin: Valence Estimation Deviation Density

Jitter: Musical Sophistication & Valence Estimation by Genre

Scatter: True Valence & Estimated Valence

Bar: Average Valence Estimation Deviation

Row

Violin Interpretation

The violin graph plots the 12 genres against the average valence estimation deviation (i.e. the average of the differences between what participants estimated and Spotify’s true valence). This plot consists of two parts: one is the box plot, the other is the ‘violin’. From the box plot we can read the minimum and maximum values, as well as the first quartile, median and third quartile. The width of the violin at a given height indicates how many observations have that particular valence deviation.

From this graph we can deduce whether valence in particular genres was systematically over- or underestimated by participants. For five genres this is not the case: Electronic, Hip-Hop, Pop, R&B and Singer/Songwriter all have a median of 0. For the other genres, however, the medians range from -1 to +2. The largest deviations are found for Dance and House, with a median two points above the true valence. For Soundtracks, the first quartile and the median both equal +1, but the third quartile is +4.
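The quantities read off the box plots can be reproduced with a short calculation. The sketch below uses made-up ratings, not our data, and takes the deviation to be the estimate minus the true valence, so that positive values mean overestimation. The function names and the "inclusive" quartile method are assumptions; plotting software may compute quartiles slightly differently.

```python
from collections import defaultdict
from statistics import quantiles


def deviations_by_genre(ratings):
    """Group signed deviations (estimate minus true valence) by genre."""
    grouped = defaultdict(list)
    for genre, estimate, true_valence in ratings:
        grouped[genre].append(estimate - true_valence)
    return dict(grouped)


def box_summary(values):
    """Five-number summary as shown by a box plot."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Made-up (genre, estimate, true valence) triples on the 1-7 scale.
ratings = [
    ("pop", 4, 4), ("pop", 3, 4), ("pop", 5, 4), ("pop", 4, 5), ("pop", 6, 5),
    ("dance", 5, 3), ("dance", 6, 4), ("dance", 4, 2), ("dance", 5, 4), ("dance", 7, 3),
]
summaries = {genre: box_summary(devs) for genre, devs in deviations_by_genre(ratings).items()}
for genre, summary in summaries.items():
    print(genre, summary)
```

A median of 0 then corresponds to a genre that is not systematically over- or underestimated, while a positive median indicates overestimation.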

Jitter Interpretation

In this jitter plot, the valence estimation error (i.e. the difference between a participant’s estimate and the true valence) is plotted against the Gold-MSI score and grouped per music genre. The estimation error can be positive as well as negative, which makes it possible to explore whether some genres are systematically overestimated or underestimated. In addition, we used linear regression to investigate the potential relationship between the dependent and independent variable per genre. It is important to note that, as opposed to the plot on the main page, every dot in this graph represents one attempt at estimating the valence, not one participant. Because this means that we have multiple observations per MSI value, the command geom_jitter() was used to avoid a straight line of points and make the graph easier to interpret.

As the graph shows, there seems to be no clear negative relationship between the valence estimation error and musical sophistication for any music genre, contrary to what we hypothesized. Only the fitted lines for the genres pop, soul/blues and R&B show a very small negative slope. The genres hip-hop, rock and house even display a small positive relationship. Another interesting finding is that the genres house, dance and soundtracks seem to be systematically overestimated, as indicated by fitted lines well above 0. This is in accordance with the findings from the violin plot and constitutes an interesting topic for further research.
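To make the per-genre fitted lines concrete, the sketch below fits an ordinary least squares line of estimation error on Gold-MSI score for each genre. It is a plain Python illustration with made-up numbers, not the code behind our plots; the function names and example observations are assumptions.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_per_genre(observations):
    """Fit one line per genre from (genre, msi_score, error) triples."""
    grouped = defaultdict(lambda: ([], []))
    for genre, msi_score, error in observations:
        grouped[genre][0].append(msi_score)
        grouped[genre][1].append(error)
    return {genre: fit_line(xs, ys) for genre, (xs, ys) in grouped.items()}


# Made-up observations: one row per rating attempt, not per participant.
observations = [
    ("pop", 60, 1), ("pop", 80, 0), ("pop", 100, -1),
    ("house", 60, 2), ("house", 80, 2), ("house", 100, 3),
]
fits = fit_per_genre(observations)
for genre, (slope, intercept) in fits.items():
    print(genre, round(slope, 3), round(intercept, 3))
```

In this toy data the house line has a small positive slope and lies above 0 across the observed MSI range, which is the pattern the text describes as systematic overestimation.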

Scatter Interpretation

As can be seen, some genres have similar graphs, but overall the graphs differ considerably from each other. A single graph including all genres would not give a clear picture, so we opted for genre-specific graphs. For most genres there is a positive trend, which is clearest in the data for R&B and Singer/Songwriter. Another noteworthy result is that for Dance and House the lower valence values are overestimated on average, whereas for Metal the higher valence values are underestimated on average. For Electronic, Hip-Hop and Rock most average estimates lie around the central value, which suggests that for these genres the degree of positivity is not clearly distinguishable.

It can also be seen that for most genres, songs with extreme valence values (1 or 7) are not recognized as such by all participants who rated them. These results raise the question of whether the valence values determined by Spotify are actually representative of the valence or positivity a song evokes in people. The Soundtracks graph raises the same question: Spotify assigned all but one track a valence value of 1 or 2, but the participants’ average judgements are very spread out.

Bar Interpretation

Discussion

Column

General conclusion

The present study sought to investigate the relationship between musical sophistication and valence estimation by comparing people’s valence estimates against values provided by Spotify. Counter to the main hypothesis, the results showed a weak positive relationship between musical sophistication and valence estimation inaccuracy. This indicates that musical sophistication is not a predictor of accurate perception of musical valence, and that musical sophistication might even interfere with valence perception, though this should be taken with a grain (or nugget) of salt. MISSING INFO: GENRE BREAKDOWN. MISSING INFO: CORRELATION WITH SPOTIFY’S VALUES AND DISCUSSION OF THESE RESULTS.

Column

Limitations

Discuss limitations

Future research

Discuss future research into the area

Appendix

Column

References

Frijda, N. H. (1986). The emotions. Cambridge: Cambridge University Press.

International Federation of the Phonographic Industry. (2018). IFPI Global Music Report 2019. Retrieved from: https://www.ifpi.org/news/IFPI-GLOBAL-MUSIC-REPORT-2019

Spotify. (2018). Spotify Technology S.A. Announces Financial Results for First Quarter 2018. Retrieved from: https://investors.spotify.com/financials/press-release-details/2018/Spotify-Technology-SA-Announces-Financial-Results-for-First-Quarter-2018/default.aspx

Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45(4), 1191-1207.

Column

Downloadables

Links to our GitHub repositories, and to downloadables of the data we used